Overview

Dataset statistics

Number of variables 9
Number of observations 767
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 54.1 KiB
Average record size in memory 72.2 B

Variable types

Numeric 8
Categorical 1

Warnings

Variable 6 has 111 (14.5%) zeros
Variable 72 has 35 (4.6%) zeros
Variable 35 has 227 (29.6%) zeros
Variable 0 has 373 (48.6%) zeros
Variable 33.6 has 11 (1.4%) zeros
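
The percentages in these warnings can be re-derived from the zero counts and the number of observations. The sketch below is only an arithmetic check on the reported figures; the dictionary and function names are illustrative, and the report itself cannot say whether a zero is a real measurement or a placeholder for a missing value.

```python
N_OBSERVATIONS = 767  # "Number of observations" from the overview

# Zero counts per variable, copied from the warnings in this report.
zero_counts = {"6": 111, "72": 35, "35": 227, "0": 373, "33.6": 11}


def zero_share(count, total=N_OBSERVATIONS):
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: zero_share(count) for name, count in zero_counts.items()}

# List the variables from the highest to the lowest share of zeros.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"variable {name!r}: {pct}% zeros")
```

Sorting by share puts variable 0 (almost half zeros) first, which is probably the column to inspect before any modelling.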

Reproduction

Analysis started 2021-04-23 20:10:00.468176
Analysis finished 2021-04-23 20:10:16.372684
Duration 15.9 seconds
Software version pandas-profiling v2.12.0

Variables

6
Real number (ℝ≥0)

ZEROS

Distinct 17
Distinct (%) 2.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 3.842242503
Minimum 0
Maximum 17
Zeros 111
Zeros (%) 14.5%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 1
median 3
Q3 6
95-th percentile 10
Maximum 17
Range 17
Interquartile range (IQR) 5

Descriptive statistics

Standard deviation 3.370876524
Coefficient of variation (CV) 0.8773200862
Kurtosis 0.1612926278
Mean 3.842242503
Median Absolute Deviation (MAD) 2
Skewness 0.9039762644
Sum 2947
Variance 11.36280854
Monotonicity Not monotonic
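
Several of the figures reported for this variable are simple functions of the others. The following sketch, which uses only numbers copied from the report, recomputes the mean, range, IQR, variance and coefficient of variation; the variable names are illustrative.

```python
# Values copied from the statistics reported for variable "6".
n = 767                 # number of observations
total = 2947            # Sum
minimum, maximum = 0, 17
q1, q3 = 1, 6
std = 3.370876524       # Standard deviation

mean = total / n                   # Mean = Sum / number of observations
value_range = maximum - minimum    # Range = Maximum - Minimum
iqr = q3 - q1                      # IQR = Q3 - Q1
variance = std ** 2                # Variance = standard deviation squared
cv = std / mean                    # CV = standard deviation / mean

print(f"mean={mean:.6f} range={value_range} iqr={iqr}")
print(f"variance={variance:.6f} cv={cv:.6f}")
```

The same identities hold for every numeric variable in the report, so they can serve as a quick consistency check when figures are copied by hand.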
Histogram with fixed size bins (bins=17)
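
As a rough illustration of what "fixed size bins" means, the sketch below splits the value range into equally wide bins and counts the values per bin. It follows the common convention that the last bin includes the maximum; whether the report used exactly this rule is an assumption, and the sample values are made up for the example.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum would land in bin `bins`; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Illustrative values only; they span the same 0..17 range as variable "6".
sample = [0, 1, 1, 2, 3, 6, 10, 17]
counts, edges = fixed_width_histogram(sample, bins=17)
print(counts)
```

With a range of 17 and 17 bins, each bin is one unit wide, so the values 16 and 17 would share the last bin.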

Common values

Value Count Frequency (%)
1 135 17.6%
0 111 14.5%
2 103 13.4%
3 75 9.8%
4 68 8.9%
5 57 7.4%
6 49 6.4%
7 45 5.9%
8 38 5.0%
9 28 3.7%
Other values (7) 58 7.6%

Minimum 5 values

Value Count Frequency (%)
0 111 14.5%
1 135 17.6%
2 103 13.4%
3 75 9.8%
4 68 8.9%

Maximum 5 values

Value Count Frequency (%)
17 1 0.1%
15 1 0.1%
14 2 0.3%
13 10 1.3%
12 9 1.2%

148
Real number (ℝ≥0)

Distinct 136
Distinct (%) 17.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 120.8591917
Minimum 0
Maximum 199
Zeros 5
Zeros (%) 0.7%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0
5-th percentile 79
Q1 99
median 117
Q3 140
95-th percentile 181
Maximum 199
Range 199
Interquartile range (IQR) 41

Descriptive statistics

Standard deviation 31.97846846
Coefficient of variation (CV) 0.2645927713
Kurtosis 0.6429918315
Mean 120.8591917
Median Absolute Deviation (MAD) 20
Skewness 0.1764123623
Sum 92699
Variance 1022.622445
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
99 17 2.2%
100 17 2.2%
129 14 1.8%
125 14 1.8%
106 14 1.8%
111 14 1.8%
102 13 1.7%
95 13 1.7%
105 13 1.7%
108 13 1.7%
Other values (126) 625 81.5%

Minimum 5 values

Value Count Frequency (%)
0 5 0.7%
44 1 0.1%
56 1 0.1%
57 2 0.3%
61 1 0.1%

Maximum 5 values

Value Count Frequency (%)
199 1 0.1%
198 1 0.1%
197 4 0.5%
196 3 0.4%
195 2 0.3%

72
Real number (ℝ≥0)

ZEROS

Distinct 47
Distinct (%) 6.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 69.10169492
Minimum 0
Maximum 122
Zeros 35
Zeros (%) 4.6%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0
5-th percentile 38.6
Q1 62
median 72
Q3 80
95-th percentile 90
Maximum 122
Range 122
Interquartile range (IQR) 18

Descriptive statistics

Standard deviation 19.36815466
Coefficient of variation (CV) 0.2802847988
Kurtosis 5.168577977
Mean 69.10169492
Median Absolute Deviation (MAD) 8
Skewness -1.841911017
Sum 53001
Variance 375.1254149
Monotonicity Not monotonic
Histogram with fixed size bins (bins=47)

Common values

Value Count Frequency (%)
70 57 7.4%
74 52 6.8%
78 45 5.9%
68 45 5.9%
72 43 5.6%
64 43 5.6%
80 40 5.2%
76 39 5.1%
60 37 4.8%
0 35 4.6%
Other values (37) 331 43.2%

Minimum 5 values

Value Count Frequency (%)
0 35 4.6%
24 1 0.1%
30 2 0.3%
38 1 0.1%
40 1 0.1%

Maximum 5 values

Value Count Frequency (%)
122 1 0.1%
114 1 0.1%
110 3 0.4%
108 2 0.3%
106 3 0.4%

35
Real number (ℝ≥0)

ZEROS

Distinct 51
Distinct (%) 6.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 20.51760104
Minimum 0
Maximum 99
Zeros 227
Zeros (%) 29.6%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 23
Q3 32
95-th percentile 44
Maximum 99
Range 99
Interquartile range (IQR) 32

Descriptive statistics

Standard deviation 15.95405906
Coefficient of variation (CV) 0.7775791637
Kurtosis -0.5183252996
Mean 20.51760104
Median Absolute Deviation (MAD) 12
Skewness 0.1120576816
Sum 15737
Variance 254.5320005
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
0 227 29.6%
32 31 4.0%
30 27 3.5%
27 23 3.0%
23 22 2.9%
18 20 2.6%
28 20 2.6%
33 20 2.6%
31 19 2.5%
19 18 2.3%
Other values (41) 340 44.3%

Minimum 5 values

Value Count Frequency (%)
0 227 29.6%
7 2 0.3%
8 2 0.3%
10 5 0.7%
11 6 0.8%

Maximum 5 values

Value Count Frequency (%)
99 1 0.1%
63 1 0.1%
60 1 0.1%
56 1 0.1%
54 2 0.3%

0
Real number (ℝ≥0)

ZEROS

Distinct 186
Distinct (%) 24.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 79.90352021
Minimum 0
Maximum 846
Zeros 373
Zeros (%) 48.6%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 32
Q3 127.5
95-th percentile 293
Maximum 846
Range 846
Interquartile range (IQR) 127.5

Descriptive statistics

Standard deviation 115.2831052
Coefficient of variation (CV) 1.442778802
Kurtosis 7.205266456
Mean 79.90352021
Median Absolute Deviation (MAD) 32
Skewness 2.270630168
Sum 61286
Variance 13290.19433
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
0 373 48.6%
105 11 1.4%
140 9 1.2%
130 9 1.2%
120 8 1.0%
100 7 0.9%
180 7 0.9%
94 7 0.9%
115 6 0.8%
135 6 0.8%
Other values (176) 324 42.2%

Minimum 5 values

Value Count Frequency (%)
0 373 48.6%
14 1 0.1%
15 1 0.1%
16 1 0.1%
18 2 0.3%

Maximum 5 values

Value Count Frequency (%)
846 1 0.1%
744 1 0.1%
680 1 0.1%
600 1 0.1%
579 1 0.1%

33.6
Real number (ℝ≥0)

ZEROS

Distinct 248
Distinct (%) 32.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 31.9904824
Minimum 0
Maximum 67.1
Zeros 11
Zeros (%) 1.4%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0
5-th percentile 21.8
Q1 27.3
median 32
Q3 36.6
95-th percentile 44.41
Maximum 67.1
Range 67.1
Interquartile range (IQR) 9.3

Descriptive statistics

Standard deviation 7.889090901
Coefficient of variation (CV) 0.2466074379
Kurtosis 3.282498397
Mean 31.9904824
Median Absolute Deviation (MAD) 4.6
Skewness -0.4279502476
Sum 24536.7
Variance 62.23775525
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
32 13 1.7%
31.6 12 1.6%
31.2 12 1.6%
0 11 1.4%
33.3 10 1.3%
32.4 10 1.3%
32.8 9 1.2%
30.8 9 1.2%
30.1 9 1.2%
32.9 9 1.2%
Other values (238) 663 86.4%

Minimum 5 values

Value Count Frequency (%)
0 11 1.4%
18.2 3 0.4%
18.4 1 0.1%
19.1 1 0.1%
19.3 1 0.1%

Maximum 5 values

Value Count Frequency (%)
67.1 1 0.1%
59.4 1 0.1%
57.3 1 0.1%
55 1 0.1%
53.2 1 0.1%

0.627
Real number (ℝ≥0)

Distinct 516
Distinct (%) 67.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.4716740548
Minimum 0.078
Maximum 2.42
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 0.078
5-th percentile 0.1403
Q1 0.2435
median 0.371
Q3 0.625
95-th percentile 1.1333
Maximum 2.42
Range 2.342
Interquartile range (IQR) 0.3815

Descriptive statistics

Standard deviation 0.3314973556
Coefficient of variation (CV) 0.7028102399
Kurtosis 5.593373853
Mean 0.4716740548
Median Absolute Deviation (MAD) 0.166
Skewness 1.921190451
Sum 361.774
Variance 0.1098904968
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
0.258 6 0.8%
0.254 6 0.8%
0.238 5 0.7%
0.261 5 0.7%
0.207 5 0.7%
0.268 5 0.7%
0.259 5 0.7%
0.245 4 0.5%
0.299 4 0.5%
0.687 4 0.5%
Other values (506) 718 93.6%

Minimum 5 values

Value Count Frequency (%)
0.078 1 0.1%
0.084 1 0.1%
0.085 2 0.3%
0.088 2 0.3%
0.089 1 0.1%

Maximum 5 values

Value Count Frequency (%)
2.42 1 0.1%
2.329 1 0.1%
2.288 1 0.1%
2.137 1 0.1%
1.893 1 0.1%

50
Real number (ℝ≥0)

Distinct 52
Distinct (%) 6.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 33.2190352
Minimum 21
Maximum 81
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 6.1 KiB

Quantile statistics

Minimum 21
5-th percentile 21
Q1 24
median 29
Q3 41
95-th percentile 58
Maximum 81
Range 60
Interquartile range (IQR) 17

Descriptive statistics

Standard deviation 11.7522956
Coefficient of variation (CV) 0.3537819665
Kurtosis 0.6608718486
Mean 33.2190352
Median Absolute Deviation (MAD) 7
Skewness 1.135164695
Sum 25479
Variance 138.1164518
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
22 72 9.4%
21 63 8.2%
25 48 6.3%
24 46 6.0%
23 38 5.0%
28 35 4.6%
26 33 4.3%
27 32 4.2%
29 29 3.8%
31 24 3.1%
Other values (42) 347 45.2%

Minimum 5 values

Value Count Frequency (%)
21 63 8.2%
22 72 9.4%
23 38 5.0%
24 46 6.0%
25 48 6.3%

Maximum 5 values

Value Count Frequency (%)
81 1 0.1%
72 1 0.1%
70 1 0.1%
69 2 0.3%
68 1 0.1%

1
Categorical

Distinct 2
Distinct (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory size 6.1 KiB
0 500
1 267

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 767
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 1
3rd row 0
4th row 1
5th row 0

Value Count Frequency (%)
0 500 65.2%
1 267 34.8%

Histogram of lengths of the category

Most occurring characters

Value Count Frequency (%)
0 500 65.2%
1 267 34.8%

Most occurring categories

Value Count Frequency (%)
Decimal Number 767 100.0%

Most frequent character per category

Value Count Frequency (%)
0 500 65.2%
1 267 34.8%

Most occurring scripts

Value Count Frequency (%)
Common 767 100.0%

Most frequent character per script

Value Count Frequency (%)
0 500 65.2%
1 267 34.8%

Most occurring blocks

Value Count Frequency (%)
ASCII 767 100.0%

Most frequent character per block

Value Count Frequency (%)
0 500 65.2%
1 267 34.8%

Interactions

(56 pairwise interaction plots; images not reproduced in this text export.)

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
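
A minimal sketch of this calculation in plain Python follows. It uses the population form (dividing by n), which gives the same r as the sample form because the factor cancels; the function name and the data are illustrative and not taken from the dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Illustrative data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting and positively rescaling each variable separately leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_shifted)
```

Rescaling one variable by a negative factor would flip the sign of r, which is why the invariance is limited to positive changes of scale.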

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
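
The sketch below follows that recipe: it converts each variable to ranks (ties receive the average of their positions, a common convention assumed here) and then applies Pearson's formula to the ranks. The helper names and the data are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this example ρ reaches 1 because the ordering is perfectly preserved, while Pearson's r stays below 1 because the relationship is not linear.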

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
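
The pair counting can be sketched directly. The version below implements the formula exactly as described, which corresponds to the tau-a variant and makes no correction for ties; statistical libraries often report a tie-adjusted variant, so their values may differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Four observations: only the pair (2, 3) / (3, 2) is discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.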

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

    6  148  72  35    0  33.6  0.627  50  1
0   1   85  66  29    0  26.6  0.351  31  0
1   8  183  64   0    0  23.3  0.672  32  1
2   1   89  66  23   94  28.1  0.167  21  0
3   0  137  40  35  168  43.1  2.288  33  1
4   5  116  74   0    0  25.6  0.201  30  0
5   3   78  50  32   88  31.0  0.248  26  1
6  10  115   0   0    0  35.3  0.134  29  0
7   2  197  70  45  543  30.5  0.158  53  1
8   8  125  96   0    0   0.0  0.232  54  1
9   4  110  92   0    0  37.6  0.191  30  0
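
The column labels in this table (6, 148, 72, ...) look like a data record rather than names, which suggests, though the report cannot confirm it, that the source file had no header row and its first record was consumed as the header. The sketch below reproduces that effect with Python's `csv` module on three lines assembled from the labels and the first two sample rows; the `col_0` ... `col_8` names are placeholders, not the real variable names. If the data were loaded with pandas, passing `header=None` to `read_csv` would be the analogous fix.

```python
import csv
import io

# Three lines of a hypothetical headerless CSV, assembled from the column
# labels and the first two sample rows shown in this report.
raw = (
    "6,148,72,35,0,33.6,0.627,50,1\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
    "8,183,64,0,0,23.3,0.672,32,1\n"
)

# Default behaviour: the first line becomes the field names and is lost as data.
with_header = list(csv.DictReader(io.StringIO(raw)))

# Supplying placeholder names keeps every line as a record.
names = [f"col_{i}" for i in range(9)]
without_header = list(csv.DictReader(io.StringIO(raw), fieldnames=names))

print(len(with_header), list(with_header[0])[:3])
print(len(without_header), without_header[0]["col_0"])
```

Under this assumption the dataset would hold 768 records rather than the 767 reported, and every statistic above would shift slightly once the first record is restored.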

Last rows

      6  148  72  35    0  33.6  0.627  50  1
757   1  106  76   0    0  37.5  0.197  26  0
758   6  190  92   0    0  35.5  0.278  66  1
759   2   88  58  26   16  28.4  0.766  22  0
760   9  170  74  31    0  44.0  0.403  43  1
761   9   89  62   0    0  22.5  0.142  33  0
762  10  101  76  48  180  32.9  0.171  63  0
763   2  122  70  27    0  36.8  0.340  27  0
764   5  121  72  23  112  26.2  0.245  30  0
765   1  126  60   0    0  30.1  0.349  47  1
766   1   93  70  31    0  30.4  0.315  23  0